ABSTRACT

In the early 'RNA world' stage of life, RNA stored 
genetic information and catalyzed chemical reac- 
tions. However, the RNA world eventually gave rise 
to the DNA-RNA-protein world, and this transition 
included the 'genetic takeover' of information stor- 
age by DNA. We investigated evolutionary advan- 
tages for using DNA as the genetic material. The 
error rate of replication imposes a fundamental limit 
on the amount of information that can be stored in 
the genome, as mutations degrade information. We 
compared misincorporation rates of RNA and DNA 
in experimental non-enzymatic polymerization and 
calculated the lowest possible error rates from a 
thermodynamic model. Both analyses found that 
RNA replication was intrinsically error-prone com- 
pared to DNA, suggesting that total genomic infor- 
mation could increase after the transition to DNA. 
Analysis of the transitional RNA/DNA hybrid du- 
plexes showed that copying RNA into DNA had 
similar fidelity to RNA replication, so information 
could be maintained during the genetic takeover. 
However, copying DNA into RNA was very error- 
prone, suggesting that attempts to return to the 
RNA world would result in a considerable loss of 
information. Therefore, the genetic takeover may 
have been driven by a combination of increased 
chemical stability, increased genome size and 
irreversibility. 

INTRODUCTION 

The RNA world theory posits that RNA fulfilled both of 
the major cellular functions, catalysis and information 
storage, during an early stage of life (1-3). RNA possesses 
the ability to store genetic information (e.g. in retroviruses),
and RNA sequences can fold into complex structures,
enabling enzymatic activity (ribozymes). The finding that 
the catalytic core of the ribosome comprises RNA lent 
considerable credence to the RNA world theory (4-6).
This theory not only simplifies the origin of life by 
proposing a relatively uncomplicated replicating inter- 
mediate compared to a wholesale emergence of the tran- 
scription and translation machineries, but also implies 
that the RNA world then transitioned to the DNA- 
RNA-protein world. Our study is motivated by one of 
the central features of this transition: the 'genetic 
takeover' of RNA by DNA. 

The transition to DNA has previously been considered 
primarily from a chemical perspective. In particular, the 
DNA backbone is less prone to hydrolysis, since it lacks 
the nucleophilic 2'-hydroxyl group, so it represents a more 
chemically stable genetic material. In this work, we 
consider a genetic perspective: does DNA replicate with 
greater intrinsic fidelity, thus allowing more information 
to be stored? Would information be lost during the genetic 
takeover or during a hypothetical reversion back to RNA? 
RNA viruses generally have higher mutation rates than 
DNA viruses (7,8), but it is unclear whether this is due 
to replication mechanisms in place today (e.g. DNA 
proofreading enzymes), natural selection [e.g. on 
evolvability (9)] or intrinsic properties of the nucleic acid 
backbones. 

The mutation rate during replication places an import- 
ant constraint on the amount of information that can be 
stored in the genome. Intuitively, the information degrades 
over subsequent generations if mutations are too frequent 
(an error 'catastrophe'). In general, theoretical models 
indicate that the maximum genomic information content 
is inversely proportional to the mutation rate per base 
(10,11). The critical mutation rate above which the 
genomic information cannot survive is known as the 
error threshold (12). This relationship appears to hold 
for viruses, especially RNA viruses, which exist close to 
the error threshold of roughly one mutation per genome 
replication (13). Indeed, increasing the mutation rate to
precipitate an error catastrophe appears to be a practical
anti-retroviral strategy (14-16). The constraint on information
is also a serious consideration for the early stages
of life that were characterized by primitive replication 
mechanisms; mutation rates would have been quite high 
and would substantially limit the information content of 
the system (17-19). 

Therefore, we sought to compare the intrinsic error sus- 
ceptibility of RNA replication, DNA replication, copying 
RNA to DNA and copying DNA to RNA, using two 
strategies. First, we determined misincorporation rates in 
an experimental model of non-enzymatic polymerization. 
Although the polymerization chemistry present during the 
genetic takeover is not known, for these experiments we 
use 5'-phosphorimidazolides as activated monomers and
primers terminated by a 3'-amino-2',3'-dideoxynucleotide; 
this system is capable of rapid polymerization, such 
that relatively slow rates of mis-incorporation can be 
measured (18,20-24). Similar systems have been previ- 
ously used to demonstrate replication in model protocells 
(25,26). 

Second, since experimental rates presumably depend 
on kinetic effects (e.g. the activation chemistry in non- 
enzymatic polymerization, or ribozyme mechanisms 
involved in catalyzed polymerization), we also attempted 
to estimate the lowest possible error rates achievable by 
any experimental system using equilibrium thermodynam- 
ic calculations (Figure 1). To our knowledge, this is the 
first comprehensive comparison of experimental and the- 
oretical error rates and the first analysis of error rates in 
RNA/DNA hybrid duplexes. Both strategies showed that 
RNA is intrinsically error-prone compared to DNA. This 
effect can be attributed largely to the stability of G-U 
wobble pairs in RNA, which leads to a large increase in 
the frequency of mis-incorporation, particularly across U, 
in RNA compared to DNA. Characterization of the 
RNA-DNA hybrid systems showed that while RNA can 
be copied into the DNA complement fairly accurately, 
copying DNA back to an RNA complement would be 
quite inaccurate. These results suggest that information 
transfer to DNA would permit an increase in genomic 
information content as the mutation rate decreased, and 
that this transition would be essentially irreversible 
since copying back to RNA would be an error-ridden 
process. 



MATERIALS AND METHODS 

Synthesis of activated nucleotides 

All deoxynucleoside 5'-phosphorimidazolides (ImpdN) 
and nucleoside 5'-phosphorimidazolides (ImpN) were
synthesized based on a previously published protocol 
(18,27,28) (Supplementary Data S4). ImpA, ImpC and 
ImpU were synthesized by GL Synthesis Inc. 
(Worcester, MA, USA). Activated nucleotides were veri- 
fied by mass spectrometry and high-performance liquid 
chromatography (HPLC) as previously described (18) 
and were found to be >93% pure. 



Oligonucleotides for non-enzymatic polymerization 

The fluorescently labeled RNA primer was made by 
reverse synthesis in the W. M. Keck Biotechnology 
Resource Laboratory at Yale University (New Haven, 
CT, USA). The synthesis used
3'-O-tritylamino-M5-benzoyl-2',3'-dideoxyguanosine-5'-cyanoethyl
phosphoramidite (Metkinen Chemistry; Kuusisto, Finland) at
the 3'-terminus and was labeled with Cy3 at the
5'-terminus. The primer was polyacrylamide gel electro- 
phoresis (PAGE)-purified and its mass was verified by 
matrix-assisted laser desorption/ionization-time of flight 
(MALDI-TOF) (Supplementary Data S5). The fluor- 
escently labeled DNA primer was synthesized as previous- 
ly described (18). DNA oligonucleotides were synthesized 
and PAGE-purified by Sigma-Aldrich (St. Louis, MO, 
USA). RNA template sequences were from Dharmacon 
(Lafayette, CO, USA) and RNA excess primer was from 
UCDNA Services (Calgary, AB, Canada). See 
Supplementary Data S6 for oligonucleotide sequences. 

Non-enzymatic polymerization 

For observation on a laboratory timescale, the polymer- 
ization reaction required monomers activated at the 5' 
position for incorporation. The activated monomer was 
a nucleoside 5'-phosphorimidazolide (ImpN) if the primer 
backbone was RNA, or a 2'-deoxynucleoside 5'-phospho- 
rimidazolide (ImpdN) if the primer backbone was DNA. 
Templates were standard RNA or DNA, and primers 
were RNA or DNA with the exception that they were 
terminated by a single 3'-amino-2',3'-dideoxynucleotide 
at the 3' end. In all reactions, the template and primer 
were perfectly complementary at the beginning of the ex- 
periment, and each reaction was performed at least in du- 
plicate. Extension was undetectable in the absence of 
template. 

Primer extension reactions were carried out as previous- 
ly described (18) (Supplementary Data S7), with the 
template and primer backbones varied to be either DNA 
or RNA. A primer (0.325 µM) and a template (1.3 µM)
(1 µl each) were mixed in water, incubated at 95°C for
5 min, and annealed by cooling to room temperature on
a benchtop for 5-7 min. In a reaction of 10 µl volume, 1 µl
of 1 M Tris (pH 7) and 0.5 µl of 4 M NaCl were added to
final concentrations of 100 mM Tris and 200 mM NaCl.
For reactions with ImpdN/ImpN, where N = A, C, or G,
the reaction was initiated by the addition of 1 µl of
100 mM ImpdN to a final concentration of 10 mM. For
reactions involving ImpdT/ImpU, 1.38 µl of 289 mM stock
solution was added to a final concentration of 40 mM. The
reaction mixtures were incubated at room temperature,
and aliquots were withdrawn at certain time points. For
matched (Watson-Crick) reactions between the nucleotide
and the template, time points were taken at 0.5 min, 1 min,
3 min, 7.5 min, 15 min, 30 min, 1 h, 2 h and 4 h. For
mismatched reactions, time points were taken at 1 min,
7.5 min, 15 min, 30 min, 1 h, 2 h, 4 h, 8 h and 24 h. A
negative control was taken before adding the ImpdN/
ImpN in each reaction. For reactions using a DNA
primer, time points were obtained by adding 1 µl of the
reaction mixture to 9 µl of the loading buffer with 8 M
urea, 100 mM ethylenediaminetetraacetic acid (EDTA),
and 1.3 µM of a competitor DNA with the sequence: 5'
GG GAT TAA TAC GAC TCA CTN 3', where N = A/
T/G/C to match the primer employed in the reaction. For
the RNA primer reactions, 65 µM of a competitor RNA
with the sequence: 5' GG GAU UAA UAC GAC UCA
CUN 3' was used instead of DNA. Time points were
heated to 95°C for 5 min to disrupt primer-template
complexes and were run on 20% denaturing PAGE. The
initial rate of disappearance of primer was calculated.
Examples of slow reactions are given in Supplementary
Data (Supplementary Figure S2).

Figure 1. Experimental and theoretical approaches to determining error rates. (A) Comparison of the experimental reaction rate of correct
incorporation (left duplex) versus incorrect incorporation (right duplex). Template strand (right strand within duplex) is either DNA (1) or RNA (2).
Primer strand (left strand within duplex) and activated nucleotide are either both DNA analogs (1) or both RNA analogs (2). (B) The free energy of
the full-length duplex is calculated as a sum of independent contributions from stacking interactions and other simple structural elements (e.g.
mismatches, bulges). Shown here is the comparison between a correctly matched product (top) using nearest-neighbor interactions (cyan and orange
boxes) versus a product containing a single mismatch (bottom; green box). This comparison was used as a theoretical estimate for the lowest possible
error rate. (C) Example of determination of experimental reaction rate: mis-incorporation of activated nucleotide (ImpdG; red) across template base
G in RNA-templated DNA polymerization (blue). Gel image shows extension over time of the original primer (n) by one base (n + 1). Decrease of
primer over time is plotted (initial rate = 0.24/h); the line is drawn to guide the eye. Initial rates were used because the reaction slows noticeably over
time such that the yield is <100%; this is likely due to spontaneous hydrolysis of the activated monomer under the reaction conditions.
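The final concentrations quoted for the reaction set-up follow from simple dilution into the 10 µl reaction volume. The short check below is only an illustrative sketch: the function name and the dictionary layout are our own, and the stock concentrations and volumes are the ones stated in the text, read as microliters and millimolar.

```python
# Dilution check (C1 * V1 = C2 * V2) for the reaction set-up described in the text.
# The helper name and data layout are illustrative, not part of the original protocol.

REACTION_VOLUME_UL = 10.0  # total reaction volume stated in the text (microliters)


def final_conc_mM(stock_mM, added_ul, total_ul=REACTION_VOLUME_UL):
    """Return the final concentration (mM) after adding a stock to the reaction."""
    return stock_mM * added_ul / total_ul


# component: (stock concentration in mM, volume added in microliters)
additions = {
    "Tris": (1000.0, 1.0),                 # 1 µl of 1 M Tris
    "NaCl": (4000.0, 0.5),                 # 0.5 µl of 4 M NaCl
    "ImpdN/ImpN (A, C, G)": (100.0, 1.0),  # 1 µl of 100 mM stock
    "ImpdT/ImpU": (289.0, 1.38),           # 1.38 µl of 289 mM stock
}

final = {name: final_conc_mM(stock, vol) for name, (stock, vol) in additions.items()}

for name, conc in final.items():
    print(f"{name}: {conc:.1f} mM")
```

Under these readings the computed values agree with the stated final concentrations (100 mM Tris, 200 mM NaCl, 10 mM and about 40 mM activated nucleotide).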



Calculation of experimental error rates 

The frequency of incorporation ($f$) of a particular nucleotide
$a'$ across a template base was calculated by dividing its
rate of extension ($r$) by the sum of the rates of extension
for all four ImpdNs or ImpNs across that template base:

\[
f_{n,a'} = \frac{r_{n,a'}}{\sum_{a} r_{n,a}}
\]

where $n$ denotes the correct nucleotide complementary to
the template base, and $a$ ranges over all four nucleotides.
If $a' = n$, then $f_{n,n}$ is also called the fidelity. If $a' \neq n$,
then $f_{n,a'}$ is the error rate per site for $a'$ incorporated
instead of $n$, or $\mu_{n,a'}$. The mutation rate per site across
a given template base $B$ ($\mu_B$) is $\sum_{m} \mu_{n,m}$, where $m \neq n$. If
the proportion of the genome composed of base $X$ is $P_X$,
then the average mutation rate of a genome ($\mu_{\mathrm{ave}}$) is
$\sum_{X} P_X \mu_X$. Since most aptamers and ribozymes have
roughly even composition [22-28% of each nucleotide;
(29,30)], we assume the genome has even composition.
Therefore

\[
\mu_{\mathrm{ave}} = \frac{1}{4} \sum_{B} \mu_B = \frac{1}{4} \sum_{n} \sum_{m \neq n} \mu_{n,m} \tag{1}
\]
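The calculation described in this subsection can be sketched in a few lines. The example below is a minimal illustration, not the authors' analysis code: the function names are ours, and the rates are the RNAt/RNAp values transcribed from Table 1.

```python
# Incorporation frequencies and average mutation rate from measured extension rates.
# Rates (per hour) are the RNAt/RNAp columns of Table 1; function names are illustrative.

# Outer key: template base; inner key: incoming activated nucleotide.
RATES_RNA_RNA = {
    "C": {"G": 2.3, "U": 0.0078, "C": 0.0035, "A": 0.0075},
    "G": {"C": 20.0, "U": 0.91, "G": 0.25, "A": 0.035},
    "A": {"U": 1.1, "C": 0.078, "G": 0.12, "A": 0.016},
    "U": {"A": 0.55, "C": 0.011, "G": 0.35, "U": 0.073},
}
WATSON_CRICK = {"C": "G", "G": "C", "A": "U", "U": "A"}


def incorporation_frequencies(rates_for_template):
    """Frequency of each nucleotide: its rate divided by the sum of all four rates."""
    total = sum(rates_for_template.values())
    return {nt: rate / total for nt, rate in rates_for_template.items()}


def mutation_rate_per_template(rates, pairing):
    """Mutation rate across each template base: sum of the three error frequencies."""
    result = {}
    for template, row in rates.items():
        freqs = incorporation_frequencies(row)
        result[template] = 1.0 - freqs[pairing[template]]
    return result


def average_mutation_rate(rates, pairing):
    """Equation 1: mean over the four template bases (even genome composition)."""
    per_template = mutation_rate_per_template(rates, pairing)
    return sum(per_template.values()) / len(per_template)


fidelity_U = incorporation_frequencies(RATES_RNA_RNA["U"])["A"]
mu_ave = average_mutation_rate(RATES_RNA_RNA, WATSON_CRICK)
print(f"fidelity across U: {fidelity_U:.3f}")
print(f"average mutation rate: {mu_ave:.3f}")
```

With the mean rates from Table 1 this gives a fidelity across U of about 0.56 and an average mutation rate of about 0.167, in line with the values reported in the Results for the RNAt/RNAp system.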



Thermodynamic estimate of lower limit of the error rate 

Using Equations 2 and 3 (Supplementary Data S8), the 
lower limit on the mutation rate is: 



\[
\mu_{n,m} \geq \mu^{0}_{n,m} = \frac{e^{-\Delta G_{n,m}/RT}}{1 + e^{-\Delta G_{n,m}/RT}} \tag{2}
\]

where $\Delta G_{n,m}$ is calculated as given below. Equation 2 can
be generalized for four nucleotides (the correct nucleotide
n and three erroneous nucleotides m) and non-equimolar
nucleotide concentrations (10) as follows:

\[
\mu_{n,m}(\mathbf{b}) \geq \mu^{0}_{n,m}(\mathbf{b}) = \frac{[m]}{\sum_{a} [a]\, e^{-(\Delta G_{n,a} - \Delta G_{n,m})/RT}} \tag{3}
\]

where b is the vector of nucleotide concentrations and a
ranges over all four nucleotides.
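The thermodynamic lower bound for non-equimolar nucleotide pools amounts to a Boltzmann-weighted competition between the four nucleotides. The sketch below illustrates this under stated assumptions: the free-energy penalties are hypothetical placeholder values (not parameters from the paper), the penalty for the correct nucleotide is taken as zero, and the function name is our own.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)
T_KELVIN = 295.15  # 22 degrees C, the experimental temperature


def lower_bound_frequencies(delta_g, conc, temperature=T_KELVIN):
    """Equilibrium incorporation frequencies across one template base.

    delta_g[a]: free-energy penalty (kcal/mol) of incorporating nucleotide a
                relative to the correct nucleotide (0.0 for the correct one).
    conc[a]:    concentration of nucleotide a (any consistent unit).
    """
    rt = R_KCAL * temperature
    weights = {a: conc[a] * math.exp(-delta_g[a] / rt) for a in delta_g}
    partition = sum(weights.values())
    return {a: w / partition for a, w in weights.items()}


# Hypothetical penalties for a template U (illustrative values only).
DELTA_G_TEMPLATE_U = {"A": 0.0, "G": 1.2, "C": 4.0, "U": 3.5}
EQUIMOLAR = {"A": 10.0, "C": 10.0, "G": 10.0, "U": 10.0}
LOW_G = {"A": 10.0, "C": 10.0, "G": 1.0, "U": 10.0}

freq_equimolar = lower_bound_frequencies(DELTA_G_TEMPLATE_U, EQUIMOLAR)
freq_low_g = lower_bound_frequencies(DELTA_G_TEMPLATE_U, LOW_G)
print(f"wobble error, equimolar pool: {freq_equimolar['G']:.4f}")
print(f"wobble error, 10-fold less G: {freq_low_g['G']:.4f}")
```

Each frequency is [a] e^{-ΔG/RT} divided by the sum over all four nucleotides, which is algebraically the same as the generalized bound; with only two nucleotides at equal concentration it reduces to the two-nucleotide bound.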

Calculation of nucleotide concentrations to minimize the 
mutation rate 

The mutation rate is given by Equation 1, which is a 
function of b. The optimal nucleotide concentrations 
b* are obtained by simultaneous numerical minimiza- 
tion of Equation 1 with respect to b, given first-order 
kinetics with respect to the nucleotide concentration 
[i.e. $r_{n,a'}$ is proportional to the concentration of $a'$]
(Supplementary Data S9). To avoid large discrepancies 
among the optimized concentrations, we constrained all 
concentrations to be within a factor of 10 of each other. 
We calculated the optimal nucleotide supply using either 
the experimentally measured rates or the thermodynamic 
lower limits for the error rates, as given in Equation 3. 
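The optimization can be illustrated with a coarse grid search. This is a simplified stand-in for the numerical minimization described above (the actual procedure is given in Supplementary Data S9 and is not reproduced here): rate constants are obtained by dividing the RNAt/RNAp rates of Table 1 by the concentrations used in the experiments, first-order kinetics are assumed, and all names are illustrative.

```python
import itertools

# RNAt/RNAp rates (per hour) from Table 1, measured at A = C = G = 10 mM, U = 40 mM.
RATES = {
    "C": {"G": 2.3, "U": 0.0078, "C": 0.0035, "A": 0.0075},
    "G": {"C": 20.0, "U": 0.91, "G": 0.25, "A": 0.035},
    "A": {"U": 1.1, "C": 0.078, "G": 0.12, "A": 0.016},
    "U": {"A": 0.55, "C": 0.011, "G": 0.35, "U": 0.073},
}
EXPERIMENT_MM = {"A": 10.0, "C": 10.0, "G": 10.0, "U": 40.0}
WATSON_CRICK = {"C": "G", "G": "C", "A": "U", "U": "A"}


def rate_constants(rates, conc):
    """First-order assumption: rate = k * concentration, so k = rate / concentration."""
    return {t: {nt: r / conc[nt] for nt, r in row.items()} for t, row in rates.items()}


def average_mutation_rate(k, conc, pairing):
    """Average mutation rate (Equation 1) for a given nucleotide supply."""
    total_error = 0.0
    for template, row in k.items():
        flux = {nt: kk * conc[nt] for nt, kk in row.items()}
        total_error += 1.0 - flux[pairing[template]] / sum(flux.values())
    return total_error / len(k)


def grid_search(k, pairing, steps=5, max_fold=10.0):
    """Scan log-spaced relative concentrations in [1, max_fold] for each nucleotide.

    Every grid point keeps all concentrations within max_fold of each other.
    """
    levels = [max_fold ** (i / (steps - 1)) for i in range(steps)]
    best = None
    for combo in itertools.product(levels, repeat=4):
        conc = dict(zip("ACGU", combo))
        mu = average_mutation_rate(k, conc, pairing)
        if best is None or mu < best[0]:
            best = (mu, conc)
    return best


K = rate_constants(RATES, EXPERIMENT_MM)
mu_equimolar = average_mutation_rate(K, {"A": 1.0, "C": 1.0, "G": 1.0, "U": 1.0}, WATSON_CRICK)
best_mu, best_conc = grid_search(K, WATSON_CRICK)
print(f"equimolar supply: {mu_equimolar:.3f}")
print(f"best grid point:  {best_mu:.3f} at relative concentrations {best_conc}")
```

Only concentration ratios matter here, so the grid uses relative units. A grid search is cruder than a continuous minimizer and can only approach the true optimum from above; it is used because it needs nothing beyond the standard library.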

Thermodynamic calculation of free energy differences 

The equilibrium probability of incorporating a certain nu- 
cleotide in a larger complex should include interactions 
with its 5' and 3' neighbors (Figure 1B). To estimate
$\Delta G_{n,m}$, we use the nearest-neighbor model for predicting
RNA and DNA duplex stabilities (31,32,33). For the
example in Figure 1B,

\[
\Delta G_{n,m} = \Delta\Delta G_{\mathrm{mismatch,\,UAC/GGA}} - \left( \Delta\Delta G_{\mathrm{stack,\,UA/UA}} + \Delta\Delta G_{\mathrm{stack,\,AC/GU}} \right).
\]

We used energy parameters given in the literature for 
stacking and mismatches for the RNAt/RNAp and
DNAt/DNAp systems, extrapolated to the experimental
temperature of 22°C as described (34,32). These
energy terms have an error of around 5-10% (32,33,35). 
Naive error propagation implies that the relative 
uncertainties of the calculated mutation rates would be 
30-60%; however, an accurate estimate of the uncer- 
tainty is also complicated by the fact that the energy par- 
ameters had been obtained by multivariate fitting of 
experimental data, so their values are likely to be highly 
correlated with each other. For the RNA/DNA hybrids, 
we used published stacking energies when available (35) 
or estimated them when not available (Supplementary 
Data S10). 
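The naive error propagation mentioned above can be made concrete. Because the estimated error frequency scales roughly as e^{-ΔG/RT}, a small absolute error δ(ΔG) translates into a relative error of about δ(ΔG)/RT. The sketch below uses a hypothetical free energy difference of 3.5 kcal/mol purely for illustration; that value is an assumption and is not taken from the paper.

```python
R_KCAL = 1.987e-3          # gas constant in kcal/(mol*K)
T_KELVIN = 295.15          # 22 degrees C
RT = R_KCAL * T_KELVIN     # about 0.59 kcal/mol


def relative_uncertainty(delta_g, fractional_error):
    """First-order relative uncertainty of exp(-delta_g / RT).

    delta_g:          free energy difference in kcal/mol
    fractional_error: fractional uncertainty of delta_g (e.g. 0.05 for 5%)
    """
    return fractional_error * delta_g / RT


ASSUMED_DELTA_G = 3.5  # kcal/mol, hypothetical illustrative value

rel_5 = relative_uncertainty(ASSUMED_DELTA_G, 0.05)
rel_10 = relative_uncertainty(ASSUMED_DELTA_G, 0.10)
print(f"5% error in dG  -> {rel_5:.0%} relative error in the mutation rate")
print(f"10% error in dG -> {rel_10:.0%} relative error in the mutation rate")
```

For this assumed ΔG, energy errors of 5-10% map to relative errors of roughly 30-60%, the same range as quoted in the text. The estimate ignores correlations between the fitted energy parameters.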



RESULTS 

To estimate the error rates (i.e. frequency of an error per 
residue) of non-enzymatic polymerization in the different 
nucleic acid systems, we used two approaches: measure- 
ments of mis-incorporation in an experimental model 
(Figure 1A) and calculations based on the thermodynamic 
differences between correct and incorrect incorporation 
(Figure 1B). Experimentally measured reaction rates
yield straightforward estimates of error rates. However, 
we do not know whether the activation chemistry used 
in our experimental model is a good mimic of the chem- 
istry of prebiotic replication. Therefore, we also sought to 
infer theoretical error rates based on the relative thermo- 
dynamic stability of correctly versus incorrectly 
paired complexes. These calculations estimate the theoret- 
ical lower limits for the error rates from the thermo- 
dynamics of RNA and DNA base pairing, which may or 
may not correlate with experimental rates that are kinet- 
ically determined. This combination of theory and experi- 
ment permitted us to cross-validate trends in the two 
separate sets of error rates and to quantitatively test 
how closely the experimental system approaches the the- 
oretical limit. We first present the experimental results and 
then describe the theoretical calculations and their rela- 
tionship to the experimental results. 

We measured experimental reaction rates of non- 
enzymatic nucleic acid polymerization using a model sys- 
tem for template-directed replication (Figure 1C; Table 1; 
Supplementary Figure S1). Misincorporation rates were
determined by comparing the rate of incorporation for 
the correct (Watson-Crick) base versus an incorrect 
base (Figure 1A; Table 2). The misincorporation rates 
and overall mutation rate for copying DNA into DNA 
had been previously determined (18); here, we deter- 
mined the corresponding rates for copying RNA into 
RNA or DNA and for copying DNA into RNA using 
the same activation chemistry (5'-phosphorimidazolide 
nucleotides with primers terminated by a 2',3'-dideoxy- 
3'-amino nucleoside at the 3' end). The average mutation 
rate (μ_ave) is given as the probability of a mutation (any
error) per site. We refer to the different nucleic acid 
systems as '(template backbone)t/(primer backbone)p', 
e.g. RNAt/DNAp designates an RNA template and 
DNA primer. 

Non-enzymatic replication of RNA has low intrinsic
fidelity compared to DNA 

The experimental mutation rate of non-enzymatic poly- 
merization on a DNA backbone with a DNA primer was 
measured in previous work [μ_ave(DNAt/DNAp) =
7.6 ± 1.4%; Figure 2A] (18). We found that non-enzymatic
polymerization on an RNA backbone with an
RNA primer under analogous reaction conditions was
more than twice as error-prone [Figure 2B; μ_ave(RNAt/RNAp)
= 16.8 ± 1.6%]. The correct Watson-Crick nucleotide
incorporated across each template base with the
highest rate. As in the DNA system, C and G templated 
with relatively high fidelity compared with A and U. 
However, mis-incorporation of G across U was a very 
prominent source of error in the RNA system, such that 
U templated with a fidelity of only 55 ± 7%, compared to
85 ± 2% for T templating in the DNA system.
Mis-incorporation of U was also the dominant error
across G in RNA, unlike G in DNA which did not have
a dominant error; nevertheless, the absolute fidelity across
G was relatively good (94.9 ± 0.2% in the DNA system;
94.5 ± 0.6% in the RNA system).

Table 1. Experimental reaction rates (r, h^-1) and standard deviation (σ)

Template  ImpN or   RNAt/RNAp           RNAt/DNAp           DNAt/RNAp
          ImpdN     r         σ         r         σ         r         σ
C         G         2.3       0.08      1.6       1.2       3.8       0.47
C         T or U    0.0078    0.0021    0.0071    0.0057    0.021     0.0003
C         C         0.0035    0.001     0.00085   0.00007   0.022     0.0072
C         A         0.0075    0.0013    0.0080    0.0029    0.032     0.0049
G         C         20        2.2       2.5       0.7       1.1       0.42
G         T or U    0.91      0.23      0.75      0.13      0.29      0.11
G         G         0.25      0.02      0.25      0.003     0.11      0.0004
G         A         0.035     0.008     0.034     0.015     0.031     0.0036
A         T or U    1.1       0.16      1.6       0.13      0.47      0.023
A         C         0.078     0.029     0.022     0.0045    0.029     0.006
A         G         0.12      0.01      0.52      0.02      0.031     0.0005
A         A         0.016     0.0008    0.016     0.0075    0.029     0.005
T or U    A         0.55      0.26      0.30      0.052     0.093     0.026
T or U    C         0.011     0.004     0.0081    0.0037    0.019     0.0037
T or U    G         0.35      0.08      0.022     0.0011    0.065     0.0093
T or U    T or U    0.073     0.019     0.029     0.0004    0.064     0.0031

RNA is copied into DNA complement with similar 
fidelity as RNA into RNA 

As an experimental proxy for the transfer fidelity of 
genetic information from RNA to DNA, we measured 
the experimental misincorporation rate using an RNA 
template and DNA primer with 2'-deoxy activated nucleo- 
tides. The misincorporation rate of this system was similar 
to that for RNA/RNA replication: μ_ave(RNAt/DNAp) =
18.1 ± 0.4% (Figure 2C). In contrast to the pure RNA 
and DNA systems, in this hybrid G templated with 
quite low fidelity (71 ± 3%), primarily from mis- 
incorporation of T. Also, in contrast to the dominant 
mis-incorporation of G across U in the pure RNA 
system, mis-incorporation of G across T did not occur 
at a high rate. Instead, most errors came from mis- 
incorporation of T across G and G across A, which cor- 
respond to non-Watson-Crick base pairs. In this system, 
only C templated relatively faithfully (fidelity of 
99.4 ± 0.1%), while mutations across G, A, and U were 
quite frequent (fidelities of 71 ± 3, 75 ± 2 and 83 ± 3%,
respectively). 

DNA is copied into RNA complement with low intrinsic 
fidelity 

In the DNAt/RNAp experimental system, the mis- 
incorporation rate was found to be very high (27 ± 3%; 
Figure 2D). In particular, the misincorporation rate in this 
system was much higher (>3x) than that of pure DNA 
replication. As with the other systems, C templated with 
good fidelity (98 ± 0.3%). However, like the RNA-templated
reactions, this system suffered from a high
rate of mis-incorporation corresponding to the G:U or
G:T wobble pair (mis-incorporation of G across T
occurred at a rate of 27 ± 7%; mis-incorporation of U
across G occurred at a rate of 18 ± 1%). In addition,
the mis-incorporation of U across T was also a prominent
error, occurring at a rate of 27 ± 1%. As a result, errors
across T were especially frequent, such that the fidelity of
copying T was only 38 ± 7%.

Table 2. Theoretical (μ_theory) and experimental (μ_exp) incorporation and mis-incorporation frequencies with standard deviation determined from
replicates (σ_exp)

Template  ImpN or   DNAt/DNAp                     RNAt/RNAp                     RNAt/DNAp                     DNAt/RNAp
          ImpdN     μ_theory  μ_exp     σ_exp     μ_theory  μ_exp     σ_exp     μ_theory  μ_exp     σ_exp     μ_theory  μ_exp     σ_exp
C         G         0.9998    0.9939    2.4E-03   1.0000    0.9917    8.0E-04   0.9998    0.9939    1.0E-03   0.9999    0.9803    2.9E-03
C         T or U    1.6E-04   3.1E-03   1.6E-03   2.9E-05   3.4E-03   8.2E-04   1.6E-04   1.9E-03   4.0E-04   5.8E-05   5.5E-03   6.0E-04
C         C         3.6E-05   7.8E-04   2.3E-04   7.2E-06   1.6E-03   4.9E-04   3.8E-05   4.3E-04   1.7E-04   1.4E-05   5.7E-03   2.6E-03
C         A         1.7E-05   2.2E-03   6.1E-04   7.2E-06   3.3E-03   4.8E-04   2.7E-05   3.8E-03   4.6E-04   9.5E-06   8.4E-03   2.7E-04
G         C         0.9932    0.9487    1.5E-03   0.9628    0.9435    5.6E-03   0.9844    0.7057    3.4E-02   0.9486    0.7208    2.6E-02
G         T or U    4.1E-03   1.9E-02   1.1E-03   3.7E-02   4.3E-02   6.1E-03   1.5E-02   2.1E-01   1.2E-02   5.1E-02   1.8E-01   8.5E-03
G         G         2.4E-03   1.5E-02   4.2E-03   7.4E-06   1.2E-02   6.2E-04   1.7E-04   7.2E-02   1.6E-02   5.5E-04   7.4E-02   2.5E-02
G         A         2.0E-04   1.7E-02   6.8E-03   7.4E-06   1.7E-03   2.0E-04   4.8E-05   1.0E-02   6.6E-03   1.6E-04   2.1E-02   9.4E-03
A         T or U    0.9965    0.9058    4.7E-02   0.9997    0.8397    7.7E-03   0.9993    0.7458    1.8E-02   0.997     0.8422    2.3E-02
A         C         2.2E-04   1.8E-02   6.0E-03   8.4E-05   5.7E-02   1.4E-02   1.2E-04   1.0E-02   1.5E-03   4.9E-04   5.2E-02   1.2E-02
A         G         2.8E-03   4.6E-02   2.4E-02   8.4E-05   9.1E-02   2.0E-02   4.1E-04   2.4E-01   2.3E-02   1.7E-03   5.5E-02   2.8E-04
A         A         5.5E-04   3.1E-02   1.7E-02   8.4E-05   1.2E-02   1.0E-03   1.8E-04   7.1E-03   3.0E-03   7.8E-04   5.1E-02   1.1E-02
T or U    A         0.9921    0.8471    1.5E-02   0.8845    0.5515    6.7E-02   0.9323    0.8298    3.0E-02   0.955     0.3819    7.1E-02
T or U    C         5.2E-04   1.6E-02   9.3E-03   2.6E-04   1.3E-02   8.9E-03   1.2E-03   2.9E-02   6.7E-03   8.2E-04   7.9E-02   7.9E-03
T or U    ...

Figure 2. Experimentally observed frequency of incorporation and mis-incorporation for copying (A) DNA into DNA, (B) RNA into RNA,
(C) RNA into DNA, (D) DNA into RNA. Activated nucleotides (ImpN or ImpdN): red = G, blue = C, purple = U or T, green = A. Panel (A)
is modified from (18). Substantial frequencies are labeled directly with the incorporated nucleotide in bold italic font within the bar graph.

Correlation between misincorporation rates in 
non-enzymatic RNA replication and ribozyme-catalyzed 
RNA replication 

The misincorporation rates of the experimental non- 
enzymatic RNAt/RNAp system correlate with those 
observed previously for a ribozyme-catalyzed system
(r² = 0.75) (17), although the ribozyme system had a
lower overall μ_ave of 3.3% (Supplementary Figure S2).
As in the non-enzymatic system, wobble pairing is re- 
sponsible for the bulk of the errors in the ribozyme 
system, indicating that this is a feature of the RNA 
backbone. 

Theoretical basis for thermodynamic estimation of error 
rates 

The conceptual basis for the theoretical approach is 
the well-established physico-chemical description of sub- 
strate discrimination and reading errors (10,36,37), in 
which the lower limit of the error rate is determined by 
the free energy difference between the correct versus 
incorrect products. This limitation is explicit in a 
Michaelis-Menten scheme describing the non-enzymatic
elongation of an RNA or DNA primer across an RNA 
or DNA template: 



\[
c + n \;\underset{k_{n,\mathrm{off}}}{\overset{k_{n,\mathrm{on}}}{\rightleftharpoons}}\; [cn] \;\xrightarrow{\;W_n\;}\; c'
\qquad\qquad
c + m \;\underset{k_{m,\mathrm{off}}}{\overset{k_{m,\mathrm{on}}}{\rightleftharpoons}}\; [cm] \;\xrightarrow{\;W_m\;}\; c'' \tag{4}
\]

where the primer-template complex c is either elongated
by the correct Watson-Crick complementary nucleotide n
or by a non-complementary base m. In this scheme,
discrimination between n and m occurs in the nucleotide
docking step to the reaction intermediates [cn] and [cm],
as the incorrect base m has association ($k_{m,\mathrm{on}}$) and
dissociation ($k_{m,\mathrm{off}}$) rates that differ from those of the correct
base. The intermediates are converted to the correctly
elongated template-primer complex c' or the erroneous
product c'' via phosphodiester bond formation at rate $W_n$ or
$W_m$, respectively. Assuming that the different nucleotides
are available at equal concentrations, Equation 4 results in
an error ratio $\phi$ (rate of incorrect product formation
divided by rate of correct product formation) given by:

\[
\phi = \frac{k_{n,\mathrm{off}} + W_n}{k_{m,\mathrm{off}} + W_m} \cdot \frac{k_{m,\mathrm{on}}}{k_{n,\mathrm{on}}}
\]

The associated mutation rate $\mu$ is:

\[
\mu = \frac{1}{1 + \phi^{-1}} \tag{5}
\]



The error ratio is minimal when bond formation is 
much slower than unbinding. In this limit, the error 
ratio approaches the ratio of the equilibrium binding con- 
stants, resulting in the thermodynamic lower bound 



\[
\phi \geq e^{-\Delta G_{n,m}/RT} \tag{6}
\]

where $\Delta G_{n,m}$ is the free energy difference between the
correct and incorrect product, R is the gas constant and 
T is the temperature. See 'Materials and Methods' section 
for details. 
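The way the error ratio approaches its thermodynamic floor can be illustrated numerically. The sketch below makes two simplifying assumptions that go beyond the text: both nucleotides share the same bond-formation rate W, and the rate constants are arbitrary illustrative numbers in which the mismatched nucleotide binds as fast as the correct one but dissociates 100 times faster.

```python
def error_ratio(k_n_on, k_n_off, k_m_on, k_m_off, w):
    """Error ratio for the docking scheme, assuming one bond-formation rate w
    for both the correct (n) and incorrect (m) nucleotide."""
    return (k_n_off + w) / (k_m_off + w) * (k_m_on / k_n_on)


def mutation_rate(phi):
    """Fraction of incorrect products for a given error ratio."""
    return phi / (1.0 + phi)


# Hypothetical rate constants (arbitrary units), chosen for illustration only.
K_N_ON, K_N_OFF = 1.0, 1.0
K_M_ON, K_M_OFF = 1.0, 100.0

# Ratio of equilibrium binding constants (incorrect over correct).
equilibrium_ratio = (K_M_ON / K_M_OFF) / (K_N_ON / K_N_OFF)

bond_rates = [1e-3, 1.0, 100.0, 1e4]
phis = [error_ratio(K_N_ON, K_N_OFF, K_M_ON, K_M_OFF, w) for w in bond_rates]

for w, phi in zip(bond_rates, phis):
    print(f"W = {w:g}: phi = {phi:.4f}, mutation rate = {mutation_rate(phi):.4f}")
print(f"equilibrium limit: {equilibrium_ratio:.4f}")
```

With these numbers the error ratio rises monotonically with W and stays above the equilibrium ratio of 0.01, which it approaches only when bond formation is much slower than unbinding.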

Thermodynamic estimates from nearest-neighbor 
interactions correlate with experimentally 
observed misincorporation rates 

We estimated the lower theoretical limit for frequencies of 
each possible error in the four systems (DNAt/DNAp, 
RNAt/RNAp, RNAt/DNAp and DNAt/RNAp) using 
energetic calculations from a nearest-neighbor model 
based on the established free energy rules for DNA and 
RNA secondary structure formation (Figure 1B; Table 2)
(34,31). The calculated frequencies correspond to a hypo- 
thetical situation in which the four possible fully elongated 
products of a given template (one correct and three erro- 
neous primer-template complexes) would be allowed to 
reach thermodynamic equilibrium. The thermodynamic 
predictions correlated well with the experimentally 
observed error rates (Figure 3A-D). The thermodynamic 
calculations do appear to represent a lower limit to the 
experimentally observed rates, which were usually greater
by one to three orders of magnitude. This discrepancy is
probably due at least in part to the non-equilibrium 
conditions and lack of downstream incorporation in the 
experiments (see 'Discussion' section). Interestingly, the 
equilibrium model predicted the high frequency of 
mis-incorporation of G:U-type wobble pairs quite well 
(ratio of experimental to theoretical rates was <10). 
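The gap between measured and predicted error frequencies can be read directly off Table 2. The snippet below is an illustrative sketch using the RNAt/RNAp mis-incorporation entries for template bases C, G and A as transcribed from Table 2; the variable names are our own.

```python
import math

# (template, incorporated nucleotide): (theoretical frequency, experimental frequency)
# RNAt/RNAp mis-incorporation entries from Table 2, templates C, G and A.
RNA_RNA_ERRORS = {
    ("C", "U"): (2.9e-5, 3.4e-3),
    ("C", "C"): (7.2e-6, 1.6e-3),
    ("C", "A"): (7.2e-6, 3.3e-3),
    ("G", "U"): (3.7e-2, 4.3e-2),
    ("G", "G"): (7.4e-6, 1.2e-2),
    ("G", "A"): (7.4e-6, 1.7e-3),
    ("A", "C"): (8.4e-5, 5.7e-2),
    ("A", "G"): (8.4e-5, 9.1e-2),
    ("A", "A"): (8.4e-5, 1.2e-2),
}

# log10 of (experimental / theoretical): orders of magnitude above the lower limit.
log_ratio = {
    pair: math.log10(exp / theory) for pair, (theory, exp) in RNA_RNA_ERRORS.items()
}

for (template, nucleotide), value in sorted(log_ratio.items(), key=lambda kv: kv[1]):
    print(f"{nucleotide} across {template}: 10^{value:.2f} above the thermodynamic limit")

closest_pair = min(log_ratio, key=log_ratio.get)
```

For these entries every experimental frequency lies above its thermodynamic estimate, and the pair closest to the limit is U across G, the wobble pair.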

Thermodynamic estimates follow the same trends as 
experimental mutation rates 

To determine whether the trends we found comparing the 
DNA, RNA and hybrid systems in our experiments were 
intrinsic to the nucleic acid backbones versus heavily 
influenced by the activation chemistry, we looked for the 
same trends in our thermodynamic calculations. As in the 
experimental system, our thermodynamic calculations 
indicate that DNA replication (theoretical μ_ave = 0.5%)
is intrinsically more faithful than RNA replication
(theoretical μ_ave = 3.8%). In addition, copying RNA into DNA
(theoretical μ_ave = 2.1%) was about as faithful as RNA
replication. Copying DNA into RNA was error-prone
(theoretical μ_ave = 2.5%) compared to pure DNA replication.
These relationships verified the major trends
observed in our experimental system. The error of these 
estimated mutation rates would be due to errors in the 
energy parameters on which these calculations are based, 
which have uncertainties of 5-10% (see 'Materials and 
Methods' section). 
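Since the Introduction notes that the maximum genomic information content is inversely proportional to the per-base mutation rate, the average mutation rates quoted in the text can be turned into relative information capacities. The sketch below is illustrative only: the proportionality constant is unknown, so only ratios relative to RNA replication are computed, and the function name is our own.

```python
# Average mutation rates per site (as fractions), as quoted in the text.
MU_EXPERIMENT = {
    "DNAt/DNAp": 0.076,
    "RNAt/RNAp": 0.168,
    "RNAt/DNAp": 0.181,
    "DNAt/RNAp": 0.27,
}
MU_THEORY = {
    "DNAt/DNAp": 0.005,
    "RNAt/RNAp": 0.038,
    "RNAt/DNAp": 0.021,
    "DNAt/RNAp": 0.025,
}


def relative_capacity(mu, reference="RNAt/RNAp"):
    """Information capacity relative to the reference system, assuming that
    capacity is inversely proportional to the mutation rate per site."""
    return {system: mu[reference] / rate for system, rate in mu.items()}


capacity_experiment = relative_capacity(MU_EXPERIMENT)
capacity_theory = relative_capacity(MU_THEORY)

for system in MU_EXPERIMENT:
    print(
        f"{system}: {capacity_experiment[system]:.2f}x (experiment), "
        f"{capacity_theory[system]:.2f}x (theory) relative to RNA replication"
    )
```

Under this assumption, DNA replication would support roughly 2.2 times the information of RNA replication based on the experimental rates, and about 7.6 times based on the thermodynamic estimates.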

Alternative nucleotide ratios 

Because polymerization is apparently first order with 
respect to nucleotide concentration, μ_ave depends on the
ratios of nucleotide concentrations. To match conditions 
for previously published data (18), actual experimental 
conditions were [A] = [C] = [G] = 10 mM, [T or
U] = 40 mM, which were also used for the thermodynamic 
calculations above. We calculated the expected mutation 
rate in two additional conditions of interest: (i) equimolar 
nucleotide supply and (ii) an optimal nucleotide ratio that 
would minimize the mutation rate (Table 3). In (ii), we 
wondered whether very high rates of particular errors 
could be countered by adjustments in the nucleotide 
pool. We calculated the optimal nucleotide supply using 
either the experimentally determined error rates or the 
thermodynamic estimates (Figure 3E and F; Table 4). 
To avoid large discrepancies among the nucleotide con- 
centrations, we also constrained them to be within 10-fold 
of each other. For all systems, reducing the concentration 
of G would improve fidelity, essentially because mis- 
incorporation of G tended to be a major source of error 
while the correct incorporation of G across template C 
was already very efficient (Figure 2). 

The major trends, that RNA replication is error-prone 
compared to DNA replication, that RNA is copied into 
the DNA complement with similar fidelity as RNA repli- 
cation and that DNA is copied into the RNA complement 
with a relatively high mutation rate, also held for equi- 
molar conditions and optimized conditions (Table 3). 
Copying DNA into RNA was still very error-prone 
compared to the other systems, suggesting that this 

Figure 3. Experimental incorporation and mis-incorporation frequencies versus thermodynamic predictions for copying (A) DNA into DNA, (B)
RNA into RNA, (C) RNA into DNA, (D) DNA into RNA. Lines: linear regression on the log values; r² values as given. Error bars are from
experimental replicates. The gray zones represent the areas in which the experimental frequency would be less than the theoretical frequency; correct 
incorporations (upper right corner) should lie within the gray zone (i.e. observed fidelity is less than the theoretical maximum) while 
mis-incorporations should lie outside the gray zone (i.e. observed error frequencies are greater than the theoretical minimum). Nucleotide supply 
for minimizing experimental (E) or theoretical (F) mutation rates. See Table 4 for values and experimental error bars. Systems are denoted as 
template/primer. Red = A, orange = C, green = G, blue = T or U. 



Table 3. Mutation rates (μ_ave) predicted for equimolar or optimized nucleotide ratios, based on either experimental rates or thermodynamic lower bounds

                      Equimolar ratios           Optimized ratios
Template   Primer     Experiment    Theory       Experiment    Theory
RNA        RNA        22 ± 1%       3%           8.4 ± 0.9%    0.5%
DNA        DNA        11 ± 4%       0.6%         5.8 ± 0.8%    0.2%
RNA        DNA        21.5 ± 0.4%   2%           9.4 ± 0.05%   0.4%
DNA        RNA        28 ± 4%       2%           21 ± 3%       0.6%

Error bars given are standard deviations from calculations based on
duplicate batches of experimental mutation rates.



process could not be made as faithful as the other systems 
through optimization of the nucleotide supply alone. 



DISCUSSION 

In our experiments, non-enzymatic RNA polymerization 
had about twice the misincorporation rate of DNA poly- 
merization, suggesting that more information could be 
stably encoded after the switch to DNA as the genetic 
material. This might translate into roughly a doubling of 
genome information. In addition to the increased chemical 
stability of DNA, the potential to increase information 
content might present another selective advantage to an 



organism that made this transition. We also studied the 
RNAt/DNAp and DNAt/RNAp hybrids as exemplars of 
transitional forms, although the transitions could involve 
more complicated mixed backbones in reality. Using these 
exemplars, we found that copying RNA into DNA 
occurred with a mutation rate similar to RNA replication, 
suggesting that the genetic takeover itself would not cause 
much loss of information. In contrast, copying DNA back 
into RNA was a highly error-prone process, suggesting 
that an organism that attempted to switch from DNA 
back to RNA would be at an immediate disadvantage 
from the corruption of genetic information (Figure 4). It 
should be noted that the genetic takeover of the RNA 
world did not necessarily proceed directly to DNA, but 
might have proceeded through intermediate stages con- 
taining alternative nucleic acid backbones. If that were 
the case, one may not draw conclusions regarding the re- 
versibility of the genetic takeover from our results, 
although the difference between RNA and DNA replica- 
tion would still be relevant. 
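
The link between a twofold difference in misincorporation rate and a doubling of genome information can be made explicit with the standard error-threshold approximation. This is a sketch under stated assumptions: σ (the selective superiority of the fittest sequence) is a symbol introduced here and is assumed to be the same for RNA and DNA genomes, and μ denotes the per-nucleotide mutation rate.

```latex
% Error-threshold approximation (assumed form; \sigma is not defined in the text)
\[
  L_{\max} \approx \frac{\ln \sigma}{\mu}
  \qquad\Longrightarrow\qquad
  \frac{L_{\max}^{\mathrm{DNA}}}{L_{\max}^{\mathrm{RNA}}}
  \approx \frac{\mu_{\mathrm{RNA}}}{\mu_{\mathrm{DNA}}}
  \approx 2
\]
```

Under these assumptions the maximum maintainable genome length scales inversely with the mutation rate, so halving the error rate roughly doubles the length that can be maintained.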

The experimental system used here is only a laboratory 
model for nucleic acid polymerization without enzymes, 
and it is unclear what activation chemistries (and back- 
bones) would have been present during the origin of life. 
The use of phosphorimidazolides in non-enzymatic, 
template-directed polymerization was pioneered by Orgel 
and others, who found that apparently minor substitu- 
tions on the leaving group led to large differences in re- 
activity with a 3'-OH nucleophile in an RNA primer; in 
particular, the 2-methylimidazole derivative resulted in 
Table 4. Optimal relative concentrations of A, C, G, T or U nucleotides for minimization of mutation rate, based on rates from experimental
system (ES) or thermodynamic calculation (TC)

System  Template  Primer  Fraction of A    Fraction of C      Fraction of G      Fraction of T or U
ES      DNA       DNA     0.227 ± 0.019    0.189 ± 0.047      0.0575 ± 0.0002    0.526 ± 0.066
ES      RNA       RNA     0.425 ± 0.009    0.101 ± 0.007      0.0432 ± 0.0001    0.432 ± 0.001
ES      RNA       DNA     0.295 ± 0.041    0.335 ± 0.021      0.034 ± 0.002      0.337 ± 0.018
ES      DNA       RNA     0.358 ± 0.011    0.1969 ± 0.0004    0.048 ± 0.004      0.397 ± 0.015
TC      DNA       DNA     0.30             0.31               0.04               0.35
TC      RNA       RNA     0.37             0.37               0.04               0.23
TC      RNA       DNA     0.44             0.34               0.04               0.18
TC      DNA       RNA     0.42             0.42               0.04               0.11

Standard deviations are calculated for optimization based on duplicate batches of experimental reaction rates.

Figure 4. Evolutionary consequences of the observed hierarchy of
mutation rates. Expansion of genome upon genetic takeover and loss 
of information during reversion back to RNA. 



much more efficient polymerization, which the authors 
proposed was due to improving the geometry of the 
reaction (38-43). Polymerization efficiency was further
enhanced by use of a 3'-NH2 nucleophile, which reacted
well even when the leaving group was relatively poor (e.g. 
imidazole) (44,45). While the 3'-amine nucleophile is not 
thought to be particularly prebiotically plausible, it is 
useful for laboratory study because of the fast rate of 
primer extension. 

While the experimentally observed trends are suggest- 
ive, it would be difficult to draw conclusions about the 
genetic takeover from our experimental results alone. 
We therefore sought to validate the observed trends by a 
thermodynamic model, which is independent of the acti- 
vation chemistry. To determine whether the trends in ex- 
perimental error rates reflected underlying biophysical 
properties of the duplexes rather than specific properties 
of the activation chemistry, we calculated the error rates 
for a hypothetical system at thermodynamic equilibrium, 
i.e. if the four possible fully extended products (one per- 
fectly matched and three mismatched duplexes) were 
allowed to equilibrate with one another. This analysis 
should give the thermodynamic error rates of the system, 
in contrast to error rates in an experimental implemen- 
tation (which instead depend on the kinetic pathways, 
determined by activation chemistry and/or enzymes). 
Also, in our experimental system, we inferred error rates 



of non-enzymatic polymerization from the rates of mis- 
incorporation of single nucleotides, ignoring secondary ef- 
fects such as stalling of polymerization downstream 
of a mismatch (see below). This is a simplification of 
replication of a full strand; in contrast, our theoretical 
calculations do estimate the lowest possible error rate 
for replication of full strands. An alternative approach 
might be to calculate the free energy difference between 
terminal matches and mismatches (i.e. after a single in- 
corporation). Such an approach would assume that all 
of the potential discrimination is due to the energetics of 
a single incorporation. However, the presence of a 
terminal mismatch substantially decreases downstream 
polymerization speed (18), such that the mutation fre- 
quency of a single incorporation overestimates the fre- 
quency of mutations in fully extended products. To 
include such effects in an estimate of the thermo- 
dynamic bound on mutation rates, the equilibrium of 
fully extended products (not termini alone) must be 
calculated. Without enzymes or proofreading, substrate 
discrimination is limited by the thermodynamic free 
energy difference between the correct and incorrect final 
products. We used a nearest-neighbor model to estimate 
the equilibrium distribution of products and thus infer a 
lower bound on the rate for each possible mutation 
(Figure 1B). To our knowledge, this constitutes the first
complete and quantitative exploration of the thermo- 
dynamic limit on error rates for RNA and DNA 
replication. 
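
The equilibrium argument above can be illustrated with a short Boltzmann calculation. This is a simplified sketch, not the nearest-neighbor computation itself: it assumes the duplex free energies of the four fully extended products are already known, treats relative nucleotide concentrations as statistical weights, and uses invented free-energy values for a template U. The function names and all numbers are hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def equilibrium_fractions(delta_g, conc, temp_k=298.15):
    """Equilibrium fractions of the fully extended products.

    delta_g[n]  duplex free energy (kcal/mol) when nucleotide n is incorporated
    conc[n]     relative concentration of nucleotide n (statistical weight)
    """
    rt = R_KCAL * temp_k
    weights = {n: conc[n] * math.exp(-delta_g[n] / rt) for n in delta_g}
    z = sum(weights.values())
    return {n: w / z for n, w in weights.items()}


def thermodynamic_error(delta_g, conc, correct, temp_k=298.15):
    """Lower bound on the error fraction: everything except the matched product."""
    return 1.0 - equilibrium_fractions(delta_g, conc, temp_k)[correct]


# Hypothetical free energies for products formed opposite a template U:
# the matched product (A) is most stable, the G:U wobble is the next best.
example_dg = {"A": -10.0, "C": -7.0, "G": -8.5, "U": -7.0}
equimolar = {n: 1.0 for n in "ACGU"}

error_bound = thermodynamic_error(example_dg, equimolar, correct="A")
print(f"thermodynamic error bound for this site: {error_bound:.3f}")
```

In this toy case the wobble product accounts for most of the bound, since a 1.5 kcal/mol penalty suppresses it far less than the 3 kcal/mol penalty applied to the other two mismatches.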

As expected, all our experimental error rates lie on or 
above the thermodynamic lower bound (Figure 3A-D). 
The fact that most experimental values lie substantially 
above the lower bound may be due to at least two 
factors. First, our experiments studied a single incorpor- 
ation, but additional discrimination occurs when the 
primer is further extended because non-enzymatic poly- 
merization slows after an incorrect vs. correct incorpor- 
ation (18). Second, the thermodynamic lower bounds 
are approached in the equilibrium limit of very slow 
incorporation reactions [low Ws in Equation (4);
Supplementary Data S1], but any activation chemistry
that gives reaction rates amenable to laboratory study is 
unlikely to be near this limit. An accurate prediction of the 
error rates in our experimental system would require a 
quantitative understanding of the microscopic kinetics of 
hybridization and chemical bond formation, which are
currently not known (Supplementary Data S2). 

Importantly, the thermodynamic limits on the average 
mutation rates, while lower than the experimental values, 
are also consistent with the trends represented in Figure 4, 
corroborating our qualitative conclusions on the evolu- 
tionary advantage of switching from RNA to DNA. 
Although not necessarily expected a priori, we also
found a good correlation between the theoretical and experimental
error rates in all four template-primer
systems (Figure 3A-D). While experimental trends
may be strongly affected by the activation chemistry and 
presence of enzymes, the thermodynamic trends are pre- 
sumably not affected by these kinetic considerations. 
Experimentally, the properties of the nucleophile and 
leaving group would affect the rate of bond formation 
[Ws in Equation (4)]. Presumably, increasing W (e.g. 
3'-amine nucleophile) implies a greater degree of kinetic 
rather than thermodynamic control, which may generally 
decrease fidelity. In other words, error rates should 
approach the thermodynamic limit as W decreases, 
because slow bond formation would allow more time to 
explore different conformations. One might further specu- 
late that changing the activation chemistry would affect 
the Ws of different reactions in a similar way, such that 
experimental incorporation and mis-incorporation fre- 
quencies should correlate with the thermodynamic limits. 
Indeed, we found this to be the case (Figure 3A-D). A 
possible interpretation of this correlation is that bond for- 
mation is relatively independent of the properties of the 
base pair or mis-pair, such that the relative reaction rates 
reflect the binding equilibria of the monomers to the 
template-primer complex. The nucleoside triphosphates
used as substrates in biological systems, including the
RNA polymerase ribozyme, are kinetically stable (low
W without enzymes), which might imply improved fidelity.
In principle, this improvement may or may not be relevant 
because enzymes also change the reaction pathway. 
Regardless, we find a reasonable correlation between the 
incorporation and mis-incorporation rates of ribozyme- 
catalyzed RNA polymerization and non-enzymatic RNA 
polymerization (Supplementary Figure S2), suggesting 
some underlying similarity in the reaction pathways. 

In both experiments and thermodynamic calculations,
the major mispairs contributing to the higher mutation
rates of RNA and the hybrids were the G:U(T) wobble
pairs. This corroborates previous observations that the
G:U wobble pair is a greater source of error when 
copying RNA compared to the G:T wobble pair in 
DNA (46). In principle, this difference may be due to 
the backbone structure or to the different structure of U 
versus T. Given that the hybrid duplexes tend to adopt 
conformations close to the A-form helix (47,48), our 
finding that G:T is a major source of error in the DNA/ 
RNA hybrids suggests that the backbone may be more 
important than the additional methyl group of T in 
determining fidelity. Interestingly, the predominant muta- 
tions of the RNA polymerase ribozyme are also due to 
G:U mispairs (17), supporting the idea that this error is a 
feature of the RNA backbone rather than the activation 
chemistry. Why the A-form backbone might better 



accommodate this error is unclear; one may speculate 
that the possibilities include greater flexibility of the 
single-stranded template (49,50) or greater tolerance of 
non-canonical stacking interactions due to the presence 
of slide and roll in the helix (51). 

We also observed the major trends summarized in 
Figure 4 using equimolar concentrations or concentra- 
tions that were optimized to minimize the overall mutation 
rate (a situation that might evolve under selective pressure 
to reduce errors). The consistent depletion of G to 
minimize the error rate (Figure 3E and F) suggests that 
practical implementations of non-enzymatic replication 
should decrease the relative concentration of G in order 
to improve overall fidelity. Interestingly, a similar deple- 
tion of GTP would also enhance the fidelity of an RNA 
polymerase ribozyme, reducing the error rate by more 
than a factor of two, as mis-incorporation of G across 
U is quite efficient (17). Furthermore, one may note that 
the concentrations of DNA precursors in the nucleus of 
various mammalian cells (~25% A, 20% C, 5-10% G, 
45-50% T) (52) are surprisingly close to the values ob- 
tained from optimizing fidelity in non-enzymatic DNA 
replication (Figure 3E and F). While this correspondence 
may be coincidental, it is tempting to speculate that the 
observed dNTP supply might have evolved to control 
mutation rates (Supplementary Data S3). 

In non-enzymatic RNA replication, like DNA replica- 
tion, G and C templated with very good fidelity while A 
and U suffered the most errors. However, not all errors 
were equally likely, and in RNA the disparities among 
different errors were very pronounced, with G being 
incorporated across U as the dominant mutation. Over 
multiple generations of RNA replication, these disparities 
would bias toward a GC-rich genome, as U:A would tend 
to be replaced by C:G. This contrasts with DNA replica- 
tion, in which no single type of error was particularly 
dominant. Therefore, a DNA genome might have tended 
toward a more even nucleotide composition, which could 
be advantageous since heavily GC-rich sequences pose 
practical problems (e.g. difficult strand separation), and 
indeed ribozymes and aptamers have relatively even com- 
position compared to other RNAs (29). 

The observed hierarchy of mutation rates also suggests 
that non-enzymatic transcription (e.g. of ribozymes en- 
coded on a DNA genome) would be significantly more 
error-prone than genome replication. Error-prone tran- 
scription implies high phenotypic variability (53-55). 
Based on our non-enzymatic polymerization experiments, 
approximately one-quarter of nucleotides would be copied 
erroneously. Ribozyme-catalyzed polymerization could be 
more faithful, with a mutation rate <1% (56), but a sig- 
nificant proportion of transcripts would still have errors 
(e.g. for a 40-mer ribozyme, about one-third of transcripts 
would contain an error). In addition, a meta-analysis of 
two self-cleaving ribozymes showed that the majority of 
mutations (~75%) were deleterious (57), indicating that 
fitness may be greatly increased or decreased by a single 
mutation. Therefore, phenotypic variability could lead to 
an evolutionary 'look-ahead' effect (58): while each genotype
specifies a particular ribozyme sequence, it also leads
to a cloud of transcripts nearby in RNA sequence space. 
A given genotype may thus exhibit an overall phenotype
influenced by its neighbors, resulting in a locally smooth- 
ened phenotypic fitness landscape (Supplementary 
Figure S3). The smoothing due to phenotypic variability 
may enhance evolvability by producing a selective benefit 
from relatively distant optima and facilitating evolution- 
ary paths across low-fitness regions, although this advan- 
tage comes at the expense of decreased fitness for 
optimized sequences due to frequent transcription errors 
(53). This decreased fitness may have been a necessary cost 
of the transition to DNA as a more stable genetic 
material. Finally, phenotypic variability may also allow 
longer genomes because of a relaxed error threshold 
(59). Under prebiotic conditions, the evolutionary advan- 
tages of evolvability and larger genomes may have out- 
weighed the cost. 
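
The 40-mer estimate given earlier in this paragraph follows from treating each position as an independent copying event. The sketch below makes that arithmetic explicit; the independence assumption and the function name are illustrative, and 1% is used as the upper end of the quoted ribozyme mutation rate.

```python
def fraction_with_error(mu, length):
    """Fraction of transcripts carrying at least one error.

    Assumes each of `length` positions is miscopied independently with
    probability `mu`, so an error-free transcript has probability (1 - mu)**length.
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must be a probability")
    return 1.0 - (1.0 - mu) ** length


# Ribozyme-catalyzed copying at a 1% mutation rate, 40-mer transcript.
ribozyme = fraction_with_error(0.01, 40)

# Non-enzymatic copying of DNA into RNA at roughly one error in four positions.
non_enzymatic = fraction_with_error(0.25, 40)

print(f"ribozyme-catalyzed: {ribozyme:.3f}")
print(f"non-enzymatic:      {non_enzymatic:.5f}")
```

At a 1% rate about one-third of 40-mer transcripts carry an error, whereas at a rate near 25% essentially every transcript does.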

The beginning of the RNA world would have been 
dominated by polymerization chemistry, so factors such 
as the geometry of the template-primer-nucleotide 
complex were important determinants of the reaction 
rates and the mutation rate. However, the later stages 
of the RNA world could have contained sophisticated 
ribozymes that could influence the mutation rate. For 
example, the RNA polymerase ribozymes (17,56) generally
have lower mutation rates (0.88-1.3%) than the
non-enzymatic polymerization systems, suggesting that 
the ribozyme imposes additional discrimination. 
Understanding the mechanistic basis of fidelity in these 
systems is an important goal for future research. 

Mutation rates could be decreased by endergonic proof- 
reading mechanisms. It is unclear whether protein enzymes 
evolved before or after the genetic takeover. Our results 
demonstrate that error rates for non-enzymatic polymer- 
ization are severely limiting, so one may speculate that the 
increase in fidelity accompanying a transition to DNA 
might have been required for the emergence of translation 
machinery. Regardless, the basal error ratios (φ) and their
thermodynamic bounds (φ_0) are also of fundamental significance
for enzymes with proofreading capability. While
these enzymes use chemical energy to drive one or more
proofreading steps, the discrimination in each step is typically
based on a scheme similar to Equation 1 and would
be limited by an analogous thermodynamic bound.
In theory, this could decrease the error ratio to φ^(n+1),
where φ is the basal error ratio of the interaction (which
cannot be lower than the thermodynamic bound) and n is 
the number of proofreading steps (37). Therefore, while 
absolute fidelities could be improved by proofreading, the 
basal fidelity of the interaction is still an important factor; 
a replicator having greater basal fidelity would require less 
proofreading to achieve the same overall fidelity. 
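
The trade-off between basal fidelity and proofreading effort can be illustrated numerically. This sketch assumes the idealized limit in which the initial selection and each proofreading step each contribute one factor of the basal error ratio, so that n steps give an overall error ratio of φ^(n+1); the function name and the example values of φ and the target are hypothetical.

```python
def proofreading_steps_needed(phi, target):
    """Smallest number of proofreading steps n such that phi**(n + 1) <= target.

    phi     basal error ratio of a single discrimination step (0 < phi < 1)
    target  desired overall error ratio (0 < target < 1)
    """
    if not 0.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    steps = 0
    error = phi  # initial selection without any proofreading
    while error > target:
        error *= phi  # each proofreading step multiplies in another factor
        steps += 1
    return steps


# Two hypothetical replicators aiming for the same overall error ratio.
target = 1e-6
high_fidelity = proofreading_steps_needed(0.02, target)
low_fidelity = proofreading_steps_needed(0.2, target)
print(f"phi = 0.02 needs {high_fidelity} steps; phi = 0.2 needs {low_fidelity} steps")
```

With these illustrative numbers, a tenfold better basal error ratio reduces the required number of proofreading steps from eight to three.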

The genetic takeover of the RNA world by DNA may 
have been influenced by several factors, including chemical 
stability and multiple evolutionary considerations. Since 
the high mutation rate of non-enzymatic polymerization 
may have presented a serious limitation to information 
storage, our data and calculations suggest that the 
switch to DNA would allow an expansion of genomic in- 
formation. The mutation profile of DNA appears to be 
relatively unbiased compared to RNA, so a DNA genome 
might be more conducive to ribozyme evolution. In 



addition, our results suggest that the switch from RNA 
to DNA would have been a one-way transition, as copying 
DNA back into RNA would cause loss of genomic infor- 
mation. At the same time, a high non-enzymatic 'tran- 
scriptional' error rate might present the advantage of 
greater evolvability. Based on the correspondence between 
our non-enzymatic data and equilibrium thermodynamic 
calculations, these trends appear to reflect intrinsic features 
of the different nucleic acid duplexes. However, while our 
results are suggestive, error rates should be investigated 
using different activation chemistries and nucleic acid 
backbones to determine the robustness of these conclu- 
sions about the genetic takeover. Eventually, the absolute 
mutation rates would change as ribozymes evolved in the 
RNA world and protein enzymes emerged; nevertheless, 
the basal fidelities may have played an important role 
early on as the first genetic systems became established 
and error correction mechanisms began to evolve. 

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online. 